Goto

Collaborating Authors

 original method




Enhanced Robotic Navigation in Deformable Environments using Learning from Demonstration and Dynamic Modulation

Chen, Lingyun, Zhao, Xinrui, Campanha, Marcos P. S., Wegener, Alexander, Naceri, Abdeldjallil, Swikir, Abdalla, Haddadin, Sami

arXiv.org Artificial Intelligence

-- This paper presents a novel approach for robot navigation in environments containing deformable obstacles. By integrating Learning from Demonstration (LfD) with Dynamical Systems (DS), we enable adaptive and efficient navigation in complex environments where obstacles consist of both soft and hard regions. We introduce a dynamic modulation matrix within the DS framework, allowing the system to distinguish between traversable soft regions and impassable hard areas in real-time, ensuring safe and flexible trajectory planning. We validate our method through extensive simulations and robot experiments, demonstrating its ability to navigate deformable environments. Additionally, the approach provides control over both trajectory and velocity when interacting with deformable objects, including at intersections, while maintaining adherence to the original DS trajectory and dynamically adapting to obstacles for smooth and reliable navigation. Navigating complex environments remains a key challenge in robotics, particularly in scenarios requiring enhanced decision-making and adaptability. While most current research emphasizes obstacle avoidance by treating all obstacles as rigid entities to be avoided entirely [1]-[3], relatively little attention has been given to environments containing deformable regions that could be incorporated into the robot's path planning.


Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects

Nunes, Henrique, Figueiredo, Eduardo, Rocha, Larissa, Nadi, Sarah, Ferreira, Fischer, Esteves, Geanderson

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have gained attention for addressing coding problems, but their effectiveness in fixing code maintainability remains unclear. This study evaluates LLMs capability to resolve 127 maintainability issues from 10 GitHub repositories. We use zero-shot prompting for Copilot Chat and Llama 3.1, and few-shot prompting with Llama only. The LLM-generated solutions are assessed for compilation errors, test failures, and new maintainability problems. Llama with few-shot prompting successfully fixed 44.9% of the methods, while Copilot Chat and Llama zero-shot fixed 32.29% and 30%, respectively. However, most solutions introduced errors or new maintainability issues. We also conducted a human study with 45 participants to evaluate the readability of 51 LLM-generated solutions. The human study showed that 68.63% of participants observed improved readability. Overall, while LLMs show potential for fixing maintainability issues, their introduction of errors highlights their current limitations.


Enhancement of price trend trading strategies via image-induced importance weights

Zhu, Zhoufan, Zhu, Ke

arXiv.org Machine Learning

We open up the "black-box" to identify the predictive general price patterns in price chart images via the deep learning image analysis techniques. Our identified price patterns lead to the construction of image-induced importance (triple-I) weights, which are applied to weighted moving average the existing price trend trading signals according to their level of importance in predicting price movements. From an extensive empirical analysis on the Chinese stock market, we show that the triple-I weighting scheme can significantly enhance the price trend trading signals for proposing portfolios, with a thoughtful robustness study in terms of network specifications, image structures, and stock sizes. Moreover, we demonstrate that the triple-I weighting scheme is able to propose long-term portfolios from a time-scale transfer learning, enhance the news-based trading strategies through a non-technical transfer learning, and increase the overall strength of numerous trading rules for portfolio selection.


Robust Multi-Robot Global Localization with Unknown Initial Pose based on Neighbor Constraints

Zhang, Yaojie, Luo, Haowen, Wang, Weijun, Feng, Wei

arXiv.org Artificial Intelligence

Multi-robot global localization (MR-GL) with unknown initial positions in a large scale environment is a challenging task. The key point is the data association between different robots' viewpoints. It also makes traditional Appearance-based localization methods unusable. Recently, researchers have utilized the object's semantic invariance to generate a semantic graph to address this issue. However, previous works lack robustness and are sensitive to overlap rate of maps, resulting in unpredictable performance in real-world environments. In this paper, we propose a data association algorithm based on neighbor constraints to improve the robustness of the system. We demonstrate the effectiveness of our method on three different datasets, indicating a significant improvement in robustness compared to previous works.


Inspecting Explainability of Transformer Models with Additional Statistical Information

Nguyen, Hoang C., Lee, Haeil, Kim, Junmo

arXiv.org Artificial Intelligence

Transformer becomes more popular in the vision domain in recent years so there is a need for finding an effective way to interpret the Transformer model by visualizing it. In recent work, Chefer et al. can visualize the Transformer on vision and multi-modal tasks effectively by combining attention layers to show the importance of each image patch. However, when applying to other variants of Transformer such as the Swin Transformer, this method can not focus on the predicted object. Our method, by considering the statistics of tokens in layer normalization layers, shows a great ability to interpret the explainability of Swin Transformer and ViT.


Memory Efficient Diffusion Probabilistic Models via Patch-based Generation

Arakawa, Shinei, Tsunashima, Hideki, Horita, Daichi, Tanaka, Keitaro, Morishima, Shigeo

arXiv.org Artificial Intelligence

Diffusion probabilistic models have been successful in generating high-quality and diverse images. However, traditional models, whose input and output are high-resolution images, suffer from excessive memory requirements, making them less practical for edge devices. Previous approaches for generative adversarial networks proposed a patch-based method that uses positional encoding and global content information. Nevertheless, designing a patch-based approach for diffusion probabilistic models is non-trivial. In this paper, we resent a diffusion probabilistic model that generates images on a patch-by-patch basis. We propose two conditioning methods for a patch-based generation. First, we propose position-wise conditioning using one-hot representation to ensure patches are in proper positions. Second, we propose Global Content Conditioning (GCC) to ensure patches have coherent content when concatenated together. We evaluate our model qualitatively and quantitatively on CelebA and LSUN bedroom datasets and demonstrate a moderate trade-off between maximum memory consumption and generated image quality. Specifically, when an entire image is divided into 2 x 2 patches, our proposed approach can reduce the maximum memory consumption by half while maintaining comparable image quality.


Fast Regularized Discrete Optimal Transport with Group-Sparse Regularizers

Ida, Yasutoshi, Kanai, Sekitoshi, Adachi, Kazuki, Kumagai, Atsutoshi, Fujiwara, Yasuhiro

arXiv.org Artificial Intelligence

Regularized discrete optimal transport (OT) is a powerful tool to measure the distance between two discrete distributions that have been constructed from data samples on two different domains. While it has a wide range of applications in machine learning, in some cases the sampled data from only one of the domains will have class labels such as unsupervised domain adaptation. In this kind of problem setting, a group-sparse regularizer is frequently leveraged as a regularization term to handle class labels. In particular, it can preserve the label structure on the data samples by corresponding the data samples with the same class label to one group-sparse regularization term. As a result, we can measure the distance while utilizing label information by solving the regularized optimization problem with gradient-based algorithms. However, the gradient computation is expensive when the number of classes or data samples is large because the number of regularization terms and their respective sizes also turn out to be large. This paper proposes fast discrete OT with group-sparse regularizers. Our method is based on two ideas. The first is to safely skip the computations of the gradients that must be zero. The second is to efficiently extract the gradients that are expected to be nonzero. Our method is guaranteed to return the same value of the objective function as that of the original method. Experiments show that our method is up to 8.6 times faster than the original method without degrading accuracy.


Demystifying What Code Summarization Models Learned

Wang, Yu, Wang, Ke

arXiv.org Artificial Intelligence

Study patterns that models have learned has long been a focus of pattern recognition research. Explaining what patterns are discovered from training data, and how patterns are generalized to unseen data are instrumental to understanding and advancing the pattern recognition methods. Unfortunately, the vast majority of the application domains deal with continuous data (i.e. statistical in nature) out of which extracted patterns can not be formally defined. For example, in image classification, there does not exist a principle definition for a label of cat or dog. Even in natural language, the meaning of a word can vary with the context it is surrounded by. Unlike the aforementioned data format, programs are a unique data structure with a well-defined syntax and semantics, which creates a golden opportunity to formalize what models have learned from source code. This paper presents the first formal definition of patterns discovered by code summarization models (i.e. models that predict the name of a method given its body), and gives a sound algorithm to infer a context-free grammar (CFG) that formally describes the learned patterns. We realize our approach in PATIC which produces CFGs for summarizing the patterns discovered by code summarization models. In particular, we pick two prominent instances, code2vec and code2seq, to evaluate PATIC. PATIC shows that the patterns extracted by each model are heavily restricted to local, and syntactic code structures with little to none semantic implication. Based on these findings, we present two example uses of the formal definition of patterns: a new method for evaluating the robustness and a new technique for improving the accuracy of code summarization models. Our work opens up this exciting, new direction of studying what models have learned from source code.